Abstract

The inverse of the star-discrepancy $N^*(d,\varepsilon)$ denotes the smallest possible cardinality
of a set of points in $[0,1]^d$ achieving a star-discrepancy of at most $\varepsilon$. By a result of
Heinrich, Novak, Wasilkowski and Woźniakowski,

$$N^*(d,\varepsilon) \le c_{\mathrm{abs}}\, d\, \varepsilon^{-2}.$$

Here the dependence on the dimension $d$ is optimal, while the precise dependence on $\varepsilon$ is
an open problem. In the present paper we prove that

$$N^*(d,\varepsilon) \le c_{\mathrm{abs}}\, d\, \varepsilon^{-3/2} \left(\log(\varepsilon^{-1})\right)^{1/2}.$$

This is a surprising result, which disproves a conjecture of Novak and Woźniakowski.

1 Introduction and statement of results 

Let $x_1, \dots, x_N$ be points from the $d$-dimensional unit cube $[0,1]^d$. The quantity

$$D_N^*(x_1, \dots, x_N) = \sup_{x \in [0,1]^d} \left| \frac{1}{N} \sum_{n=1}^{N} \mathbb{1}_{[0,x]}(x_n) - \lambda([0,x]) \right|$$

is called the star-discrepancy of $x_1, \dots, x_N$. Here we write $\lambda$ for the Lebesgue measure, and
$[0,x]$ for the set $\{y \in [0,1]^d : 0 \le y \le x\}$, where $a \le b$ for two points $a, b \in [0,1]^d$ means that
no coordinate of $a$ exceeds the corresponding coordinate of $b$. It is a well-known fact that
point sets having small star-discrepancy can be used for numerical integration, since by the 
Koksma-Hlawka inequality we have

$$\left| \int_{[0,1]^d} f(x)\, dx - \frac{1}{N} \sum_{n=1}^{N} f(x_n) \right| \le D_N^*(x_1, \dots, x_N) \cdot \operatorname{Var} f,$$

where $\operatorname{Var} f$ denotes the variation of $f$ in the sense of Hardy and Krause. The method of using
low-discrepancy point sets for numerical integration is called the Quasi-Monte Carlo method. For
more background on discrepancy theory see [5], [15], [?].
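For very small point sets the star-discrepancy defined above can be evaluated exactly by brute force. The following is a minimal sketch, not part of the original argument: it relies on the standard fact that the supremum is attained on the grid spanned by the point coordinates (together with 1), where the box volume is compared with the point counts of the open and of the closed box. The function name `star_discrepancy` is a hypothetical choice, and the running time grows like $(N+1)^d$, so this is only meant for tiny $N$ and $d$.

```python
from itertools import product

def star_discrepancy(points):
    """Exact star-discrepancy of a small point set in [0,1]^d (brute force)."""
    n = len(points)
    d = len(points[0])
    # Candidate corner coordinates per dimension: all point coordinates and 1.
    grids = [sorted({p[j] for p in points} | {1.0}) for j in range(d)]
    best = 0.0
    for corner in product(*grids):
        vol = 1.0
        for c in corner:
            vol *= c
        # Number of points in the closed box [0, corner] and in the open box [0, corner).
        closed = sum(all(p[j] <= corner[j] for j in range(d)) for p in points)
        opened = sum(all(p[j] < corner[j] for j in range(d)) for p in points)
        best = max(best, closed / n - vol, vol - opened / n)
    return best

print(star_discrepancy([(0.25,), (0.75,)]))   # two equally spaced points in d = 1
print(star_discrepancy([(0.5, 0.5)]))         # a single point in d = 2
```

The exponential cost of this enumeration in $d$ is one way to see why computing the discrepancy of a given point set becomes difficult in high dimensions.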
For a long time the main focus in the construction of low-discrepancy point sets was to achieve
a small order of the discrepancy when $d$ is fixed and $N$ tends to infinity. There exist many
constructions of point sets having a discrepancy bounded by $C_d (\log N)^{d-1} N^{-1}$. However, such
discrepancy bounds are only useful if $N$ is very large in comparison with $d$ (note that the
function $(\log N)^{d-1} N^{-1}$ is increasing for $N \le e^{d-1}$), and consequently these constructions
are computationally not feasible for high-dimensional integration problems. To describe this
problem, the notion of the inverse of the star-discrepancy was introduced. Let $N^*(d,\varepsilon)$ denote
the smallest possible cardinality of a $d$-dimensional point set achieving a star-discrepancy of
at most $\varepsilon$. By a profound result of Heinrich, Novak, Wasilkowski and Woźniakowski [12],

$$N^*(d,\varepsilon) \le c_{\mathrm{abs}}\, d\, \varepsilon^{-2}. \tag{1}$$

Comparing this upper bound with the lower bound of Hinrichs [14], which states that

$$N^*(d,\varepsilon) \ge c_{\mathrm{abs}}\, d\, \varepsilon^{-1},$$

we see that the dependence on the dimension $d$ in (1) is optimal (cf. the title of [12]). The
dependence on $\varepsilon^{-1}$ is an important open problem.
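The remark that $(\log N)^{d-1} N^{-1}$ is increasing for $N \le e^{d-1}$ can be illustrated numerically. The sketch below is only an illustration: the dimension-dependent constant $C_d$ is dropped (an assumption), and the function name `classical_rate` is hypothetical.

```python
import math

def classical_rate(n, d):
    """(log n)^(d-1) / n, the n-dependence of the classical bound (constant C_d dropped)."""
    return math.log(n) ** (d - 1) / n

d = 10
peak = math.exp(d - 1)   # the rate increases up to n = e^(d-1), roughly 8103 for d = 10
for n in (10**2, 10**3, int(peak), 10**5, 10**8):
    print(f"n = {n:>10d}   rate = {classical_rate(n, d):.3e}")
```

Already for $d = 10$ the rate is far larger than 1 at its peak, so a bound of this shape says nothing about point sets of moderate size.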

Interestingly, the main idea of the proof of (1) is to show that a random point set has the
desired properties with positive probability. More specifically, (1) is equivalent to the fact
that for any $d$ and $N$ there exists a set $x_1, \dots, x_N$ of points from $[0,1]^d$ for which

$$D_N^*(x_1, \dots, x_N) \le c_{\mathrm{abs}} \frac{\sqrt{d}}{\sqrt{N}}, \tag{2}$$

and this upper bound holds for a random point set with positive probability for an appropriate
choice of the constant. In fact, the probability of a random point set satisfying (2) is very
large already for moderate values of $c_{\mathrm{abs}}$, see [2]; furthermore, the expected value of the
star-discrepancy of a set of $N$ random points in $[0,1]^d$ is also of order $\sqrt{d}/\sqrt{N}$, see [?]. If the
exponent 2 of $\varepsilon^{-1}$ in (1) were optimal, then this would essentially mean that for moderate $N$ in
comparison with $d$ no deterministic construction of a point set could outperform the typical
performance of a random point set with respect to the star-discrepancy. Intuitively this seems to be
reasonable, since constructing low-discrepancy point sets for moderate $N$ in comparison with
$d$ seems to be a tremendously difficult problem (for example, we do not know how to construct
point sets for which (2) holds; in contrast, if $N$ is very large in comparison with $d$, then
typically net-like constructions achieve low discrepancy, see [3]). Novak and Woźniakowski
conjectured that the exponent 2 of $\varepsilon^{-1}$ in (1) is optimal. In [18, p. 63] they write:

How about the dependence on $\varepsilon^{-1}$? This is open and seems to be a difficult
problem. [...] We think that as long as we consider upper bounds of the form
$N^*(d,\varepsilon) \le C d^k \varepsilon^{-\alpha}$, the exponent $\alpha \ge 2$ and 2 cannot be improved.

See also [TTJ, Open problem 7] and [TTJ Problem 3]. 

It is the purpose of the present paper to prove that the exponent 2 of $\varepsilon^{-1}$ in (1) is not optimal.
In fact, the exponent 2 of $\varepsilon^{-1}$ in (1) can be reduced to any exponent $\alpha > 3/2$. More precisely,
we will prove the following theorem.

Theorem 1. The inverse of the star-discrepancy $N^*(d,\varepsilon)$ satisfies

$$N^*(d,\varepsilon) \le c_{\mathrm{abs}}\, d\, \varepsilon^{-3/2} \left(\log \varepsilon^{-1}\right)^{1/2}, \qquad \varepsilon \le 1/2.$$

Theorem 1 is a direct consequence of the following lemma.

Lemma 1. Let $N$ be a positive integer, and set $M = \lceil d^{1/3} N^{2/3} (\log(N/d))^{2/3} \rceil$. Then there
exists a set $x_1, \dots, x_{N+M}$ of $N + M$ points in $[0,1]^d$ such that

$$D_{N+M}^*(x_1, \dots, x_{N+M}) \le c_{\mathrm{abs}} \frac{d^{2/3} (\log(N/d))^{1/3}}{N^{2/3}}. \tag{3}$$
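To see numerically why a bound of the shape (3) leads to the exponent $3/2$, one can plug $N$ of order $d\,\varepsilon^{-3/2}(\log \varepsilon^{-1})^{1/2}$ into the right-hand side of (3) and compare the result with $\varepsilon$. The sketch below does this with all absolute constants set to 1, which is an assumption made purely for illustration; the function names are hypothetical.

```python
import math

def extra_points(n, d):
    """M = ceil(d^(1/3) n^(2/3) (log(n/d))^(2/3)), as in Lemma 1."""
    return math.ceil(d ** (1 / 3) * n ** (2 / 3) * math.log(n / d) ** (2 / 3))

def lemma1_rate(n, d):
    """Right-hand side of (3) with the absolute constant dropped."""
    return (d / n) ** (2 / 3) * math.log(n / d) ** (1 / 3)

def points_for_accuracy(d, eps):
    """N = ceil(d eps^(-3/2) sqrt(log(1/eps))), the order appearing in Theorem 1."""
    return math.ceil(d * eps ** -1.5 * math.sqrt(math.log(1 / eps)))

for d in (1, 10, 100):
    for eps in (0.5, 0.1, 0.01, 0.001):
        n = points_for_accuracy(d, eps)
        m = extra_points(n, d)
        ratio = lemma1_rate(n, d) / eps
        print(f"d={d:>3d} eps={eps:<6} N={n:>9d} M={m:>8d} rate/eps={ratio:.3f}")
```

In these examples the ratio stays below a fixed constant while $\varepsilon$ and $d$ vary, which is the scaling behind Theorem 1; the final statement additionally has to account for the $M$ extra points and for the absolute constants.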

The main idea of the proof of Lemma 1 is to construct the sequence satisfying (3) in two
randomized steps. In the first step, we take $N$ independent, identically distributed (i.i.d.)
random variables having uniform distribution on $[0,1]^d$. Using probabilistic arguments, we
show that we can choose a realization $x_1, \dots, x_N$ of these random variables having "nice"
discrepancy properties. In the second step we consider $M$ additional i.i.d. random variables
$Y_1, \dots, Y_M$, which are not uniformly distributed, but have a different distribution $G(x)$.
This distribution is chosen in such a way that the expected value of $\sum_{n=1}^{M} \mathbb{1}_{[0,x]}(Y_n)$
cancels out the deviation between the distribution of $x_1, \dots, x_N$ and the uniform distribution.
Furthermore, since $M$ is much smaller than $N$, the variances of $\sum_{n=1}^{M} \mathbb{1}_{[0,x]}(Y_n)$ are small in
comparison with the total number of points $N + M$. I have not seen such a two-step random
construction, consisting of an entirely random and an additional semi-dependent part, before,
so probably this is a new idea. In Section 3 below the key points of the argument will be
described in more detail again. The proof of Lemma 1 will be given subsequently in Section 4.

It is difficult to say whether the exponent 3/2 in Theorem 1 could be optimal. I did not find
a way to improve the method of proof of Theorem 1 leading to a further reduction of this
exponent. The presence of the logarithmic term in Theorem 1 is annoying, but I do not see
how to get rid of it. The quintessence of Theorem 1 is the following: for any choices of $d$ and
$N$, it is possible to outperform random points with respect to the discrepancy; it is always
possible to find deterministic points which perform significantly better than typical random
points. In this context, it should be noted that constructing point sets achieving discrepancy
bounds of the form (1) or (3) seems to be a very hard problem; furthermore, although random
point sets achieve (1) with large probability (see [2]), it is a very hard problem to calculate
the discrepancy of a given point set in the high-dimensional setting (see [?]). The proof of
Theorem 1 is non-constructive, just as the proof of (1). However, (1) is a result concerning
purely random, uniformly distributed random variables, which can be efficiently sampled even
in high dimensions. In contrast, the distribution function defining the random variables used
in the second step of our proof of Theorem 1 depends on the empirical distribution of the
random variables in the first step, and calculating this empirical distribution and sampling the
random variables of step two accordingly is (in a high-dimensional setting) computationally
infeasible. For more information on the tractability of multi-dimensional Quasi-Monte Carlo
integration see [?].



2 Preliminaries 

For any $x \in [0,1]^d$ we set

$$I_{[0,x]} = \mathbb{1}_{[0,x]} - \lambda([0,x]). \tag{4}$$

We will sometimes write $\mathbf{0}$ and $\mathbf{1}$ for the $d$-dimensional vectors $(0, \dots, 0)$ and $(1, \dots, 1)$,
respectively.

For any $\delta \in (0,1]$ a set $\Gamma$ of points from $[0,1]^d$ is called a $\delta$-cover of $[0,1]^d$ if for every point
$y \in [0,1]^d$ there exist $x, z \in \Gamma \cup \{\mathbf{0}\}$ such that $x \le y \le z$ and $\lambda([0,z] \setminus [0,x]) \le \delta$.

Lemma 2 ([7, Theorem 1.15]). For any $d$ and any $\delta$ there exists a $\delta$-cover of $[0,1]^d$ which
has cardinality at most $(2e)^d (\delta^{-1} + 1)^d$.

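Lemma 3 (Bernstein's inequality; see e.g. [20, Lemma 2.2.9]). Let $Z_1, \dots, Z_N$ be i.i.d.
random variables, satisfying $\mathbb{E} Z_n = 0$ and $|Z_n| \le C$ a.s. for some $C > 0$. Then for any $t > 0$

$$\mathbb{P}\left( \left| \sum_{n=1}^{N} Z_n \right| > t \right) \le 2 \exp\left( - \frac{t^2}{2 \left( \sum_{n=1}^{N} \mathbb{E} Z_n^2 \right) + 2Ct/3} \right).$$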



The following lemma is a special case of [12, Theorem 2], which is the central ingredient in
the original proof of (1). It follows from deep results of Talagrand [19] and Haussler [10].

Lemma 4. There exists an absolute constant $K$ such that the following holds: Let $H(x)$ be
a probability distribution function on $[0,1]^d$, and let $Y_1, \dots, Y_M$ be i.i.d. random variables
having distribution $H$. Then for all $t \ge K\sqrt{d}$ we have

$$\mathbb{P}\left( \sup_{x \in ([0,1] \cap \mathbb{Q})^d} \left| \frac{1}{M} \sum_{n=1}^{M} \mathbb{1}_{[0,x]}(Y_n) - H(x) \right| > \frac{t}{\sqrt{M}} \right) \le \frac{1}{t} \left( \frac{K t^2}{d} \right)^{d} e^{-2t^2}.$$



3 Main idea 

Let $N$ be given, and set $M = \lceil d^{1/3} N^{2/3} (\log(N/d))^{2/3} \rceil$. Let $x_1, \dots, x_N$ be points having
"nice" discrepancy behavior (we will use a probabilistic argument to show that points having
the desired properties exist; see Lemma 5 below for details). For any $x \in [0,1]^d$ we set

$$F(x) = \frac{1}{N} \sum_{n=1}^{N} I_{[0,x]}(x_n) = \left( \frac{1}{N} \sum_{n=1}^{N} \mathbb{1}_{[0,x]}(x_n) \right) - \lambda([0,x]).$$

Then $F(x)$ completely describes the distribution of $x_1, \dots, x_N$. Note that

$$D_N^*(x_1, \dots, x_N) = \sup_{x \in [0,1]^d} |F(x)|.$$

We want to construct $M$ random points $Y_1, \dots, Y_M$ in $[0,1]^d$, such that the joint distribution
of $x_1, \dots, x_N, Y_1, \dots, Y_M$ is as close as possible to the uniform distribution. Heuristically, we
could say we want $Y_1, \dots, Y_M$ to have the distribution function

$$\lambda([0,x]) - \frac{N F(x)}{M}. \tag{5}$$



Then we can expect that an axis-parallel box $[0,x]$ contains

$$\sum_{n=1}^{N} \mathbb{1}_{[0,x]}(x_n) + M \left( \lambda([0,x]) - \frac{N F(x)}{M} \right) = (N + M)\, \lambda([0,x])$$

points of $x_1, \dots, x_N, Y_1, \dots, Y_M$, and the deviation between the actual number of points and
this expected value will only be caused by the variance of the $M$ random variables $Y_1, \dots, Y_M$.
Since $M$ is much smaller than $N$, this method yields significantly smaller errors than
constructing all points i.i.d. randomly. However, we have to be careful, since the function in
(5) is not a distribution function (specifically, this function is in general not increasing). To
overcome this problem, we have to construct the distribution function defining $Y_1, \dots, Y_M$ in a more
sophisticated way; namely, such that it is in fact a distribution function (and in particular
increasing in each coordinate), and that it is still "close" to the function in (5).

Set $\delta = d^{2/3} N^{-2/3}$, and let $\Gamma$ be a $\delta$-cover of $[0,1]^d$. Furthermore, set

$$G(x) = \min\left\{ \max_{\gamma \in \Gamma,\ \gamma \le x} \left( \lambda([0,\gamma]) - \frac{N F(\gamma)}{M} \right),\ 1 \right\}, \qquad x \in [0,1]^d.$$

Then by construction the function $G(x)$ is monotonically non-decreasing and right-continuous
in each coordinate. If we assume that no coordinate of a point $x_1, \dots, x_N$ is zero, then
$F(x) = 0$ whenever a coordinate of $x$ is 0. Consequently, $G(x)$ only takes values between 0
and 1, satisfies $G(\mathbf{1}) = 1$, and $G(x) = 0$ if at least one coordinate of $x$ is 0. Thus $G(x)$ is a
distribution function, and we can sample $Y_1, \dots, Y_M$ with respect to $G$. If we can show that
the distance between $G(x)$ and the function in (5) is small, then the expected value of the
$Y_n$'s essentially annihilates the deviation of the distribution of $x_1, \dots, x_N$ from the uniform
distribution, and the deviation between the distribution of $x_1, \dots, x_N, Y_1, \dots, Y_M$ and the
uniform distribution is only caused by a) the variance of the random variables $Y_1, \dots, Y_M$, and
b) the deviation between $G(x)$ and the function in (5). Assuming again that the deviation
mentioned in b) is small (this is the crucial ingredient of the whole proof; it is proved in
Lemma 5 below that we can choose the deterministic sequence $x_1, \dots, x_N$ in such a way that
this deviation in fact is small), Lemma 4 proves that with positive probability in each
axis-parallel box $[0,x]$ the number of points $Y_1, \dots, Y_M$ is approximately $M G(x) \pm \sqrt{M}\sqrt{d}$, and
thus the total number of points $x_1, \dots, x_N, Y_1, \dots, Y_M$ in $[0,x]$ is approximately

$$\sum_{n=1}^{N} \mathbb{1}_{[0,x]}(x_n) + M G(x) \pm \sqrt{M}\sqrt{d} = (N + M)\, \lambda([0,x]) \pm \sqrt{M}\sqrt{d}.$$

Since $\sqrt{M}\sqrt{d} \approx d^{2/3} N^{1/3} (\log(N/d))^{1/3}$, the discrepancy of $x_1, \dots, x_N, Y_1, \dots, Y_M$ is with
positive probability bounded by $\approx d^{2/3} N^{-2/3} (\log(N/d))^{1/3}$.
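The construction of $G$ can be made concrete in dimension $d = 1$, where a uniform grid with spacing at most $\delta$ serves as a $\delta$-cover. The following sketch is only an illustration under these assumptions (one dimension, a grid cover, hypothetical function names); it builds the values of $G$ on the cover and samples the $M$ correcting points from the resulting step distribution. It makes no claim about the discrepancy of the combined point set.

```python
import bisect
import math
import random

def correcting_cdf(xs, m, cover):
    """Values of G on a sorted 1-D cover that contains 0.0 and 1.0.

    G(x) = min(max over cover points g <= x of Fbar(g), 1), where
    Fbar(g) = g - N*F(g)/M and N*F(g) = #{x_n <= g} - N*g.
    """
    n = len(xs)
    xs = sorted(xs)
    values, running = [], -math.inf
    for g in cover:
        count = bisect.bisect_right(xs, g)     # number of points in [0, g]
        fbar = g - (count - n * g) / m         # the function in (5) at g
        running = max(running, fbar)           # maximum over cover points <= g
        values.append(min(running, 1.0))
    return values

def sample_correction(cover, values, m, rng):
    """Draw m points from the step distribution with atoms at the cover points."""
    masses = [b - a for a, b in zip(values, values[1:])]
    return rng.choices(cover[1:], weights=masses, k=m)

rng = random.Random(0)
n = 2000                                          # d = 1 throughout
xs = [1.0 - rng.random() for _ in range(n)]       # uniform on (0, 1], so no point equals 0
m = math.ceil(n ** (2 / 3) * math.log(n) ** (2 / 3))
k = math.ceil(n ** (2 / 3))                       # grid spacing 1/k <= delta = n^(-2/3)
cover = [i / k for i in range(k + 1)]
values = correcting_cdf(xs, m, cover)
ys = sample_correction(cover, values, m, rng)
print(n, m, len(cover), values[0], values[-1])
```

Taking the running maximum is what forces monotonicity, and the clipping at 1 keeps the values in $[0,1]$; both operations move $G$ away from the function in (5), which is why the deviation between the two has to be controlled separately.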



4 Proof of Lemma 1

Let $N \ge 1$ be given, and let $X_1, \dots, X_N$ be i.i.d. random variables having uniform distribution
on $[0,1]^d$. Throughout the rest of the paper we will assume that $N \ge 20d$, since otherwise
Lemma 1 is trivial. Set

$$M = \lceil d^{1/3} N^{2/3} (\log(N/d))^{2/3} \rceil.$$

Furthermore, set $\delta = d^{2/3} N^{-2/3}$, and let $\Gamma$ be a $\delta$-cover of $[0,1]^d$ for which additionally $\mathbf{0} \in \Gamma$
and $\mathbf{1} \in \Gamma$ hold. By Lemma 2 we can choose $\Gamma$ such that

$$\#\Gamma \le (2e)^d (\delta^{-1} + 1)^d + 2 \le (4e)^d (N/d)^d \le (N/d)^{2d} = e^{2d \log(N/d)}. \tag{6}$$
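The chain of inequalities in (6) can be checked numerically on a logarithmic scale, which also shows where the assumption that $N$ is large compared to $d$ enters. This is a minimal sketch with hypothetical function names; it verifies individual parameter pairs and is not a proof.

```python
import math

def log_cover_bound(n, d):
    """Logarithm of (2e)^d (1/delta + 1)^d + 2, where delta = d^(2/3) n^(-2/3)."""
    delta_inv = (n / d) ** (2 / 3)
    log_main = d * (math.log(2 * math.e) + math.log(delta_inv + 1))
    # log(a + 2) = log(a) + log1p(2 / a), evaluated without forming a itself.
    return log_main + math.log1p(2 * math.exp(-log_main))

def chain_holds(n, d):
    """Check the inequalities of (6) after taking logarithms."""
    first = log_cover_bound(n, d)
    second = d * (math.log(4 * math.e) + math.log(n / d))
    third = 2 * d * math.log(n / d)
    return first <= second <= third

print(chain_holds(20, 1), chain_holds(2000, 100), chain_holds(10, 1))
```

The last call uses $N/d = 10 < 4e$, for which the step $(4e)^d (N/d)^d \le (N/d)^{2d}$ fails.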
For $x \in [0,1]^d$ we set

$$F(x) = \frac{1}{N} \sum_{n=1}^{N} I_{[0,x]}(X_n),$$

where the functions $I$ were defined in (4), and

$$\bar{F}(x) = \lambda([0,x]) - \frac{N F(x)}{M}.$$

Furthermore, we define

$$G(x) = \min\left\{ \max_{\gamma \in \Gamma,\ \gamma \le x} \bar{F}(\gamma),\ 1 \right\}.$$

Note that $F(x)$, $\bar{F}(x)$ and consequently also $G(x)$ are random objects, which depend on the
random variables $X_1, \dots, X_N$.

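Lemma 5. With positive probability, we have

$$\max_{x \in [0,1]^d} \left| G(x) - \bar{F}(x) \right| \le 36\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}.$$

For any $\gamma_1, \gamma_2 \in \Gamma$ satisfying $\gamma_1 \le \gamma_2$, we define events

$$A_{\gamma_1,\gamma_2} = \left( \bar{F}(\gamma_1) - \bar{F}(\gamma_2) > 16\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3} \right).$$

We will deduce Lemma 5 from the following Lemma 6.

Lemma 6. For any $\gamma_1, \gamma_2 \in \Gamma$, $\gamma_1 \le \gamma_2$, we have

$$\mathbb{P}(A_{\gamma_1,\gamma_2}) \le 2 e^{-6d \log(N/d)}.$$

Proof of Lemma 6. Let $\gamma_1, \gamma_2 \in \Gamma$, $\gamma_1 \le \gamma_2$, be fixed. First assume that

$$\lambda([0,\gamma_2] \setminus [0,\gamma_1]) \le 16\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}. \tag{7}$$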

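We want to apply Lemma 3 for the random variables $I_{[0,\gamma_2] \setminus [0,\gamma_1]}(X_n)$, $n = 1, \dots, N$. Note
that these random variables are independent and identically distributed, have expectation
zero and variance

$$\lambda([0,\gamma_2] \setminus [0,\gamma_1]) \left( 1 - \lambda([0,\gamma_2] \setminus [0,\gamma_1]) \right) \le \lambda([0,\gamma_2] \setminus [0,\gamma_1]) \le 16\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}.$$

Hence, using Lemma 3 and the definitions of $\bar{F}(x)$ and $M$, we obtain

$$\begin{aligned}
\mathbb{P}(A_{\gamma_1,\gamma_2})
&\le \mathbb{P}\left( \sum_{n=1}^{N} I_{[0,\gamma_2] \setminus [0,\gamma_1]}(X_n) > M \lambda([0,\gamma_2] \setminus [0,\gamma_1]) + 16 M d^{1/3} N^{-1/3} (\log(N/d))^{-1/3} \right) \\
&\le \mathbb{P}\left( \sum_{n=1}^{N} I_{[0,\gamma_2] \setminus [0,\gamma_1]}(X_n) > 16\, d^{2/3} N^{1/3} (\log(N/d))^{1/3} \right) \\
&\le 2 \exp\left( - \frac{256\, d^{4/3} N^{2/3} (\log(N/d))^{2/3}}{32 N d^{1/3} N^{-1/3} (\log(N/d))^{-1/3} + 32\, d^{2/3} N^{1/3} (\log(N/d))^{1/3} / 3} \right) \\
&\le 2 e^{-6d \log(N/d)}.
\end{aligned}$$

Here we used

$$(\log(N/d))^{2/3} \le N^{1/3} d^{-1/3},$$

which holds for $N \ge d$.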
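The arithmetic behind the last step of the first case can be checked numerically: with $t = 16\, d^{2/3} N^{1/3} (\log(N/d))^{1/3}$, the bound $C = 1$ and the variance at its largest admissible value, the exponent in Bernstein's inequality is at least $6d \log(N/d)$. The sketch below evaluates this for a few parameter pairs; the function names are hypothetical and the check is an illustration, not a substitute for the estimate.

```python
import math

def bernstein_exponent(t, variance_sum, c):
    """t^2 / (2 * (sum of variances) + 2*c*t/3), the exponent in Lemma 3."""
    return t * t / (2 * variance_sum + 2 * c * t / 3)

def case_one_exponent(n, d):
    """Exponent obtained in the first case, with the variance at its cap from (7)."""
    log_term = math.log(n / d)
    t = 16 * d ** (2 / 3) * n ** (1 / 3) * log_term ** (1 / 3)
    var_cap = 16 * d ** (1 / 3) * n ** (-1 / 3) * log_term ** (-1 / 3)
    return bernstein_exponent(t, n * var_cap, 1.0)

for n, d in ((20, 1), (10**4, 10), (10**6, 100)):
    print(n, d, round(case_one_exponent(n, d), 2), round(6 * d * math.log(n / d), 2))
```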



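Now assume that (7) does not hold, which implies that

$$\lambda([0,\gamma_2] \setminus [0,\gamma_1]) > 16\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}.$$

Then by Lemma 3 we have

$$\begin{aligned}
\mathbb{P}(A_{\gamma_1,\gamma_2})
&\le \mathbb{P}\left( \sum_{n=1}^{N} I_{[0,\gamma_2] \setminus [0,\gamma_1]}(X_n) > M \lambda([0,\gamma_2] \setminus [0,\gamma_1]) + 16 M d^{1/3} N^{-1/3} (\log(N/d))^{-1/3} \right) \\
&\le \mathbb{P}\left( \sum_{n=1}^{N} I_{[0,\gamma_2] \setminus [0,\gamma_1]}(X_n) > M \lambda([0,\gamma_2] \setminus [0,\gamma_1]) \right) \\
&\le 2 \exp\left( - \frac{\left( M \lambda([0,\gamma_2] \setminus [0,\gamma_1]) \right)^2}{2 N \lambda([0,\gamma_2] \setminus [0,\gamma_1]) + 2 M \lambda([0,\gamma_2] \setminus [0,\gamma_1]) / 3} \right) \\
&\le 2 \exp\left( - \frac{M^2 \lambda([0,\gamma_2] \setminus [0,\gamma_1])}{8N/3} \right) \\
&\le 2 \exp\left( - \frac{d^{2/3} N^{4/3} (\log(N/d))^{4/3} \cdot 16\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}}{8N/3} \right) \\
&= 2 e^{-6d \log(N/d)}.
\end{aligned}$$

Thus we have, no matter whether assumption (7) holds or not, that in any case

$$\mathbb{P}(A_{\gamma_1,\gamma_2}) \le 2 e^{-6d \log(N/d)}.$$

This proves Lemma 6.

□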



Proof of Lemma 5. By (6) we have at most $e^{2d \log(N/d)}$ possible choices for $\gamma_1 \in \Gamma$, and for
any fixed $\gamma_1$ additionally at most $e^{2d \log(N/d)}$ possible choices for $\gamma_2 \in \Gamma$. Thus, setting

$$A = \bigcup_{\gamma_1, \gamma_2 \in \Gamma,\ \gamma_1 \le \gamma_2} A_{\gamma_1,\gamma_2},$$

we have

$$\mathbb{P}(A) \le 2 e^{-6d \log(N/d)}\, e^{4d \log(N/d)} \le 1/2$$

(remember that we have assumed that $N/d \ge 20$).

Now assume that $x \in [0,1]^d$. By the definition of a $\delta$-cover, there exist $\gamma_1, \gamma_2 \in \Gamma$ such that
$\gamma_1 \le x \le \gamma_2$ and

$$\lambda([0,\gamma_2] \setminus [0,\gamma_1]) \le \delta.$$






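We have

$$\left| G(x) - \bar{F}(x) \right| \le \left| G(x) - \max_{\gamma \in \Gamma,\ \gamma \le x} \bar{F}(\gamma) \right| + \left| \max_{\gamma \in \Gamma,\ \gamma \le x} \bar{F}(\gamma) - \bar{F}(x) \right| = \underbrace{\max\left\{ \max_{\gamma \in \Gamma,\ \gamma \le x} \bar{F}(\gamma) - 1,\ 0 \right\}}_{=: T_1(x)} + \Big| \underbrace{\max_{\gamma \in \Gamma,\ \gamma \le x} \bar{F}(\gamma) - \bar{F}(x)}_{=: T_2(x)} \Big|. \tag{8}$$

Since $\bar{F}(\mathbf{1}) = 1$, we have

$$T_1(x) \le \max\left\{ \max_{\gamma \in \Gamma,\ \gamma \le \mathbf{1}} \bar{F}(\gamma) - \bar{F}(\mathbf{1}),\ 0 \right\} = \max\{ T_2(\mathbf{1}),\ 0 \} \le |T_2(\mathbf{1})|. \tag{9}$$

Furthermore,

$$T_2(x) \le \left( \max_{\gamma \in \Gamma,\ \gamma \le x} \bar{F}(\gamma) - \bar{F}(\gamma_2) \right) + \left( \bar{F}(\gamma_2) - \bar{F}(x) \right), \tag{10}$$

where

$$\begin{aligned}
\bar{F}(\gamma_2) - \bar{F}(x)
&= \left( 1 + \frac{N}{M} \right) \lambda([0,\gamma_2] \setminus [0,x]) - \frac{1}{M} \sum_{n=1}^{N} \mathbb{1}_{[0,\gamma_2] \setminus [0,x]}(X_n) \\
&\le 2\, d^{2/3} N^{1/3} M^{-1} \\
&\le 2\, d^{1/3} N^{-1/3} (\log(N/d))^{-2/3},
\end{aligned} \tag{11}$$

and

$$\begin{aligned}
-T_2(x) &\le \bar{F}(x) - \bar{F}(\gamma_1) \\
&= \left( 1 + \frac{N}{M} \right) \lambda([0,x] \setminus [0,\gamma_1]) - \frac{1}{M} \sum_{n=1}^{N} \mathbb{1}_{[0,x] \setminus [0,\gamma_1]}(X_n) \\
&\le 2\, d^{1/3} N^{-1/3} (\log(N/d))^{-2/3}.
\end{aligned} \tag{12}$$

Note that on $A^c$ (that is, on the complement of $A$) we have

$$\max_{\gamma \in \Gamma,\ \gamma \le \gamma_2} \bar{F}(\gamma) - \bar{F}(\gamma_2) \le 16\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}$$

(independent of the value of $\gamma_2$). Consequently, combining (10), (11) and (12), on $A^c$ we also
have

$$|T_2(x)| \le 18\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}, \tag{13}$$

for all $x \in [0,1]^d$ (and in particular for $x = \mathbf{1}$, which by (9) gives an upper bound for $T_1$).
Thus, by (8), (9) and (13), on $A^c$ we have

$$\left| G(x) - \bar{F}(x) \right| \le 36\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}, \tag{14}$$

for all $x \in [0,1]^d$. Note again that $\mathbb{P}(A^c) \ge 1/2$. Consequently we have proved Lemma 5. □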






Proof of Lemma 1. We could use the method from [1], in order to obtain an explicit value for
the absolute constant in Lemma 1 (and accordingly also for the absolute constant in
Theorem 1). For the sake of brevity we take a shortcut and use Lemma 4 instead.

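For a (deterministic) set $x_1, \dots, x_N$ of $N$ points in $[0,1]^d$ and for any $x \in [0,1]^d$, set

$$F(x) = \frac{1}{N} \sum_{n=1}^{N} I_{[0,x]}(x_n), \qquad \bar{F}(x) = \lambda([0,x]) - \frac{N F(x)}{M},$$

and

$$G(x) = \min\left\{ \max_{\gamma \in \Gamma,\ \gamma \le x} \bar{F}(\gamma),\ 1 \right\}.$$

By Lemma 5 we can choose $x_1, \dots, x_N$ in such a way that

$$\max_{x \in [0,1]^d} \left| G(x) - \bar{F}(x) \right| \le 36\, d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}. \tag{15}$$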



Additionally we can assume that no coordinate of any point $x_n$, $1 \le n \le N$, is 0 (the
event that a coordinate of $X_n$ is 0 has zero probability). Now let the points $x_1, \dots, x_N$,
satisfying these conditions, be fixed. As mentioned in Section 3, it is easily seen that $G(x)$
is the distribution function of a probability distribution. We have to check the following four
requirements:

• $G(x)$ is monotonically non-decreasing in each of its coordinate variables. This is true
by construction.

• $G(x)$ is right-continuous in each of its coordinate variables. This property is inherited
from the corresponding property of the function $\bar{F}(x)$.

• $G(x) = 0$ if any coordinate of $x$ is 0. Since we assumed that no coordinate of any
point $x_1, \dots, x_N$ is 0, this implies that for any $x$ which has a zero coordinate we have
$F(x) = 0$. Consequently, for such an $x$ we have $\lambda([0,x]) - N F(x)/M = 0$, which proves
$G(x) = 0$.

• $G(\mathbf{1}) = 1$. This is a consequence of the fact that $F(\mathbf{1}) = 0$.

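Let $Y_1, \dots, Y_M$ be i.i.d. random variables having distribution $G(x)$. Using Lemma 4 for
$t = c_1 K \sqrt{d}$ for some $c_1 > 1$, we get

$$\mathbb{P}\left( \sup_{x \in ([0,1] \cap \mathbb{Q})^d} \left| \frac{1}{M} \sum_{n=1}^{M} \mathbb{1}_{[0,x]}(Y_n) - G(x) \right| > \frac{c_1 K \sqrt{d}}{\sqrt{M}} \right) \le \frac{1}{c_1 K \sqrt{d}} \left( c_1^2 K^3 \right)^{d} e^{-2 c_1^2 K^2 d} < 1,$$

provided $c_1$ is chosen sufficiently large (independent of $d$). Thus it is possible to choose $M$
points $x_{N+1}, \dots, x_{N+M}$ such that

$$\sup_{x \in ([0,1] \cap \mathbb{Q})^d} \left| \frac{1}{M} \sum_{n=N+1}^{N+M} \mathbb{1}_{[0,x]}(x_n) - G(x) \right| \le \frac{c_1 K \sqrt{d}}{\sqrt{M}}. \tag{16}$$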
Combining (15) and (16), multiplying by $M$ and using the definition of $I$ we get

$$\sup_{x \in ([0,1] \cap \mathbb{Q})^d} \left| N F(x) + \sum_{n=N+1}^{N+M} I_{[0,x]}(x_n) \right| \le c_1 K \sqrt{d} \sqrt{M} + 36 M d^{1/3} N^{-1/3} (\log(N/d))^{-1/3}.$$

Thus for any $x \in ([0,1] \cap \mathbb{Q})^d$ we get

$$\begin{aligned}
\left| \sum_{n=1}^{N+M} I_{[0,x]}(x_n) \right|
&= \Bigg| \underbrace{\sum_{n=1}^{N} I_{[0,x]}(x_n)}_{= N F(x)} + \sum_{n=N+1}^{N+M} I_{[0,x]}(x_n) \Bigg| \\
&\le c_1 K \sqrt{d} \sqrt{M} + 36 M d^{1/3} N^{-1/3} (\log(N/d))^{-1/3} \\
&\le c_{\mathrm{abs}}\, d^{2/3} N^{1/3} (\log(N/d))^{1/3}.
\end{aligned} \tag{17}$$

Note that the value of the star-discrepancy does not change if we consider only axis-parallel
boxes of the form $[0,x]$ for rational points $x \in ([0,1] \cap \mathbb{Q})^d$. Thus (17) implies

$$D_{N+M}^*(x_1, \dots, x_{N+M}) \le c_{\mathrm{abs}}\, d^{2/3} N^{-2/3} (\log(N/d))^{1/3},$$

which proves the lemma.

□